
    Change-point detection in high-dimensional covariance structure

    In this paper we introduce a novel approach to the important problem of break detection. Specifically, we are interested in detecting an abrupt change in the covariance structure of a high-dimensional random process – a problem with applications in many areas, e.g., neuroimaging and finance. The developed approach is essentially a testing procedure involving the choice of a critical level. To that end, a non-standard bootstrap scheme is proposed and theoretically justified under mild assumptions. The theoretical study features a result providing guarantees for break detection. All theoretical results are established in a high-dimensional setting (dimensionality p ≫ n). The multiscale nature of the approach allows for a trade-off between sensitivity of break detection and localization. The approach can be naturally employed in an online setting. A simulation study demonstrates that the approach matches the nominal false alarm probability and exhibits high power, outperforming a recent approach.
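    A minimal sketch of this kind of procedure, in NumPy: the break statistic is taken as the sup-norm difference between sample covariances on two adjacent windows, and the critical level is calibrated by a plain i.i.d. bootstrap on a reference stretch of data. Both choices are illustrative assumptions, not the paper's exact statistic or its non-standard bootstrap scheme; the window size h stands in for a single scale of the multiscale construction.

    import numpy as np

    def cov_break_statistic(X, t, h):
        # sup-norm difference between sample covariances of the h observations
        # before and after a candidate break at index t (illustrative statistic)
        left, right = X[t - h:t], X[t:t + h]
        return np.max(np.abs(np.cov(left, rowvar=False) - np.cov(right, rowvar=False)))

    def bootstrap_critical_value(X_ref, h, alpha=0.05, n_boot=200, rng=None):
        # crude i.i.d. bootstrap on a reference stretch assumed to be break-free:
        # resample rows, recompute the maximal statistic, take the (1 - alpha) quantile
        rng = np.random.default_rng(rng)
        n = len(X_ref)
        maxima = []
        for _ in range(n_boot):
            Xb = X_ref[rng.integers(0, n, size=n)]
            maxima.append(max(cov_break_statistic(Xb, t, h) for t in range(h, n - h + 1)))
        return np.quantile(maxima, 1 - alpha)

    def detect_breaks(X, h, critical_value):
        # flag candidate break points whose statistic exceeds the calibrated level
        return [t for t in range(h, len(X) - h + 1)
                if cov_break_statistic(X, t, h) > critical_value]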

    BRULÈ: Barycenter-Regularized Unsupervised Landmark Extraction

    Unsupervised retrieval of image features is vital for many computer vision tasks where annotation is missing or scarce. In this work, we propose a new unsupervised approach to detect landmarks in images, validating it on the popular task of human face key-point extraction. The method is based on the idea of auto-encoding the wanted landmarks in the latent space while discarding the non-essential information (and effectively preserving the interpretability). The interpretable latent-space representation (a bottleneck containing nothing but the wanted key-points) is achieved by a new two-step regularization approach. The first regularization step evaluates the transport distance from a given set of landmarks to some average value (the barycenter with respect to the Wasserstein distance). The second regularization step controls deviations from the barycenter by applying random geometric deformations synchronously to the initial image and to the encoded landmarks. We demonstrate the effectiveness of the approach in both unsupervised and semi-supervised training scenarios using the 300-W, CelebA, and MAFL datasets. The proposed regularization paradigm is shown to prevent overfitting, and the detection quality is shown to improve beyond the state-of-the-art face models. Comment: 10 main pages with 6 figures and 1 table, 14 pages total with 6 supplementary figures. I.B. and N.B. contributed equally. D.V.D. is the corresponding author.
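    The two regularization steps lend themselves to a short sketch. The snippet below is a hypothetical PyTorch rendition under simplifying assumptions: landmarks are kept in a fixed order, so the transport cost to the barycenter reduces to a mean squared displacement, and the random geometric deformation is a plain integer translation applied synchronously to the image and to the encoded landmarks (the paper allows more general deformations and a genuine Wasserstein distance). The names encoder, images, landmarks, and barycenter are placeholders.

    import torch

    def barycenter_regularizer(landmarks, barycenter):
        # step 1 (sketch): transport cost from predicted landmarks (N, K, 2) to a
        # reference barycenter configuration (K, 2); with landmarks in a fixed
        # order the coupling is the identity, giving a mean squared displacement
        return ((landmarks - barycenter) ** 2).sum(dim=-1).mean()

    def equivariance_regularizer(encoder, images, max_shift=8):
        # step 2 (sketch): apply the same random translation to the image batch
        # (N, C, H, W) and to its encoded landmarks (pixel (x, y) coordinates),
        # and penalize the encoder for not commuting with it; border wrap-around
        # from torch.roll is ignored for simplicity
        dx = int(torch.randint(-max_shift, max_shift + 1, (1,)))
        dy = int(torch.randint(-max_shift, max_shift + 1, (1,)))
        shifted = torch.roll(images, shifts=(dy, dx), dims=(2, 3))
        shift = torch.tensor([dx, dy], dtype=torch.float32)
        return ((encoder(shifted) - (encoder(images) + shift)) ** 2).mean()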

    Unsupervised non-parametric change point detection in quasi-periodic signals

    We propose a new unsupervised and non-parametric method to detect change points in intricate quasi-periodic signals. The detection relies on optimal transport theory combined with topological analysis and a bootstrap procedure. The algorithm is designed to detect changes in virtually any harmonic or partially harmonic signal and is verified on three different sources of physiological data streams. Using a single algorithm, we successfully find abnormal or irregular cardiac cycles in the waveforms for six of the most frequent types of clinical arrhythmias. The validation and the efficiency of the method are shown both on synthetic and on real time series. Our unsupervised approach reaches the level of performance of supervised state-of-the-art techniques. We provide conceptual justification for the efficiency of the method and prove the convergence of the bootstrap procedure theoretically. Comment: 8 pages, 7 figures, 1 table.
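    A toy illustration of the distance-plus-bootstrap idea, under stated simplifications: the quasi-periodic signal is cut into fixed-length windows instead of detected cycles, the 1-D Wasserstein distance from SciPy compares the empirical distributions of neighbouring windows, and the detection threshold is calibrated by resampling whole windows on a stretch of signal assumed to contain no change point. The topological-analysis ingredient of the method is omitted.

    import numpy as np
    from scipy.stats import wasserstein_distance

    def window_distances(signal, window):
        # 1-D Wasserstein distance between empirical distributions of consecutive,
        # non-overlapping windows (a stand-in for per-cycle segmentation)
        segs = [signal[i:i + window] for i in range(0, len(signal) - window + 1, window)]
        return np.array([wasserstein_distance(a, b) for a, b in zip(segs, segs[1:])])

    def bootstrap_threshold(reference_signal, window, alpha=0.05, n_boot=500, rng=None):
        # resample whole windows with replacement (keeps within-cycle shape, mimics
        # a no-change null) and take the (1 - alpha) quantile of the maximal distance
        rng = np.random.default_rng(rng)
        segs = [reference_signal[i:i + window]
                for i in range(0, len(reference_signal) - window + 1, window)]
        maxima = []
        for _ in range(n_boot):
            sample = [segs[k] for k in rng.integers(0, len(segs), size=len(segs))]
            maxima.append(max(wasserstein_distance(a, b) for a, b in zip(sample, sample[1:])))
        return np.quantile(maxima, 1 - alpha)

    # windows whose distance to the next window exceeds the threshold are flagged
    # as containing a change point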

    Landmarks Augmentation with Manifold-Barycentric Oversampling

    The training of Generative Adversarial Networks (GANs) requires a large amount of data, stimulating the development of new augmentation methods to alleviate the challenge. Oftentimes, these methods either fail to produce enough new data or expand the dataset beyond the original manifold. In this paper, we propose a new augmentation method that is guaranteed to keep the new data within the original data manifold thanks to optimal transport theory. The proposed algorithm finds cliques in the nearest-neighbors graph and, at each sampling iteration, randomly draws one clique to compute the Wasserstein barycenter with random uniform weights. These barycenters then become the new natural-looking elements that one could add to the dataset. We apply this approach to the problem of landmark detection and augment the available annotation in both unpaired and semi-supervised scenarios. Additionally, the idea is validated on cardiac data for the task of medical segmentation. Our approach reduces overfitting and improves the quality metrics beyond the original data outcome and beyond the result obtained with popular modern augmentation methods. Comment: 11 pages, 4 figures, 3 tables. I.B. and N.B. contributed equally. D.V.D. is the corresponding author.
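    The sampling loop described above can be sketched in a few lines. The snippet below is a hypothetical NumPy/NetworkX rendition for landmark annotations: configurations are connected in a k-nearest-neighbour graph, maximal cliques are enumerated, and each new sample is a barycenter of one randomly drawn clique with random uniform (Dirichlet) weights. Because the landmarks within each configuration are assumed to be in one-to-one correspondence, the Wasserstein barycenter is taken to reduce to the weighted mean of matching points; the original method computes proper barycenters on the data manifold.

    import numpy as np
    import networkx as nx

    def barycentric_oversample(landmark_sets, k=5, n_new=100, rng=None):
        # landmark_sets: list of (num_points, 2) arrays with corresponding landmarks
        rng = np.random.default_rng(rng)
        X = np.stack(landmark_sets)                        # (n, num_points, 2)
        flat = X.reshape(len(X), -1)

        # k-nearest-neighbour graph on the flattened configurations
        dist = np.linalg.norm(flat[:, None] - flat[None], axis=-1)
        G = nx.Graph()
        G.add_nodes_from(range(len(X)))
        for i in range(len(X)):
            for j in np.argsort(dist[i])[1:k + 1]:
                G.add_edge(i, int(j))

        # sample barycenters of randomly drawn maximal cliques
        cliques = [c for c in nx.find_cliques(G) if len(c) >= 2]
        new_samples = []
        for _ in range(n_new):
            clique = cliques[rng.integers(len(cliques))]
            weights = rng.dirichlet(np.ones(len(clique)))  # random uniform weights
            new_samples.append(np.tensordot(weights, X[clique], axes=1))
        return new_samples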

    Bootstrap in high dimensional spaces

    The objective of this thesis is to explore theoretical properties of various bootstrap methods. We introduce convergence rates of the bootstrap procedure, which correspond to the difference between the real distribution of some statistic and its resampling approximation. In this work we analyze the distribution of the Euclidean norm of a sum of independent vectors, the maximum of such a sum in high dimension, the Wasserstein distance between empirical measures, and Wasserstein barycenters. In order to prove bootstrap convergence we involve the Gaussian approximation technique, which means that one has to find a sum of independent vectors in the considered statistic such that the bootstrap yields a resampling of this sum. Further, this sum may be approximated by a Gaussian distribution and compared with the resampling distribution via the difference between covariance matrices. In general it appears to be very difficult to reveal such a sum of independent vectors because some statistics (for example, the MLE) don't have an explicit equation and may be infinite-dimensional. In order to handle this difficulty we involve some novel results from statistical learning theory, which provide a finite-sample quadratic approximation of the likelihood and a suitable MLE representation. In the last chapter we consider the MLE of the Wasserstein barycenter model. The regularised barycenter model has bounded derivatives and satisfies the necessary conditions of the quadratic approximation.
    Furthermore, we apply the bootstrap in change-point detection methods. In the parametric case we analyse the Likelihood Ratio Test (LRT) statistic. Its high values indicate changes of the parametric distribution in the data sequence. The maximum of the LRT has a complex distribution, but its quantiles may be calibrated by means of the bootstrap. We show the convergence rates of the bootstrap quantiles to the real quantiles of the LRT distribution. In the non-parametric case, instead of the LRT we use the Wasserstein distance between empirical measures. We test the accuracy of the change-point detection methods on synthetic time series and electrocardiography (ECG) data. Experiments with ECG illustrate the advantages of the non-parametric approach versus complex parametric models and the LRT.
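    A minimal sketch of the parametric branch, under the simplest possible model (a mean shift in a univariate Gaussian sequence with unit variance): the LRT statistic at each split point is the scaled squared difference of the two sample means, its maximum over split points is the test statistic, and the bootstrap quantile is obtained by resampling the centred observations. The thesis works with far more general models and bootstrap schemes; everything below is an illustrative assumption.

    import numpy as np

    def max_lrt(x):
        # maximal Gaussian mean-shift log-likelihood-ratio statistic over split
        # points; with unit variance, 2*logLR(t) = t*(n-t)/n * (mean_L - mean_R)^2
        n = len(x)
        return max(t * (n - t) / n * (x[:t].mean() - x[t:].mean()) ** 2
                   for t in range(2, n - 1))

    def bootstrap_quantile(x, alpha=0.05, n_boot=500, rng=None):
        # calibrate the critical value by resampling the centred observations,
        # which mimics the no-change null distribution of the maximal statistic
        rng = np.random.default_rng(rng)
        centred = x - x.mean()
        maxima = [max_lrt(rng.choice(centred, size=len(x), replace=True))
                  for _ in range(n_boot)]
        return np.quantile(maxima, 1 - alpha)

    # a change point is reported when max_lrt(x) exceeds bootstrap_quantile(x)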
